Load the tidyverse package.
library(tidyverse)
## ── Attaching packages ─────────────────────────── tidyverse 1.3.0 ──
## ✓ ggplot2 3.3.2 ✓ purrr 0.3.4
## ✓ tibble 3.0.1 ✓ dplyr 1.0.0
## ✓ tidyr 1.1.0 ✓ stringr 1.4.0
## ✓ readr 1.3.1 ✓ forcats 0.5.0
## Warning: package 'ggplot2' was built under R version 3.6.2
## Warning: package 'tibble' was built under R version 3.6.2
## Warning: package 'tidyr' was built under R version 3.6.2
## Warning: package 'purrr' was built under R version 3.6.2
## Warning: package 'dplyr' was built under R version 3.6.2
## ── Conflicts ────────────────────────────── tidyverse_conflicts() ──
## x dplyr::filter() masks stats::filter()
## x dplyr::lag() masks stats::lag()
data <- read_csv("data/chds6162_data.csv")
## Parsed with column specification:
## cols(
## .default = col_double(),
## drace = col_character()
## )
## See spec(...) for full column specifications.
With the function select we can select variables (columns) from the larger data frame.
Use select to show just the gestation variable.
data %>%
select(gestation)
We can also select a range of columns. Select all the variables that belong to the father (their names start with “d”), from drace to dwt.
data %>%
select(drace:dwt)
# What about just the id column and everything after the father information?
data %>%
select(id, marital:last_col())
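To see what the range and last_col() selections keep, here is a minimal sketch on a toy data frame. The toy tibble, its values, and its column order are made up for illustration and are not the real CHDS file; a range such as drace:dwt depends on where the columns sit in the data frame, so check names(data) on the real data first.

```r
library(dplyr)

# Toy stand-in for the real data; column order and values are made up.
toy <- tibble(
  id      = 1:3,
  wt      = c(100, 128, 150),
  drace   = c("a", "b", "a"),
  dage    = c(31, 28, 40),
  dwt     = c(170, 165, 190),
  marital = c(1, 1, 2),
  inc     = c(3, 5, 2)
)

# A range keeps every column positioned from drace through dwt, inclusive.
father <- toy %>%
  select(drace:dwt)
names(father)

# last_col() stands for the final column, so this keeps id plus
# everything from marital to the end.
rest <- toy %>%
  select(id, marital:last_col())
names(rest)
```

Because the range is positional, reordering the columns would change what drace:dwt returns.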
We can drop variables using the -var format. Drop the marital variable.
data %>%
select(-marital)
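The same -var format extends to several columns at once. The sketch below uses a made-up toy tibble (not the real data) and also shows that select returns a new data frame rather than changing the original.

```r
library(dplyr)

# Toy data; the values are made up for illustration.
toy <- tibble(id = 1:2, marital = c(1, 2), inc = c(3, 5), smoke = c(0, 1))

# Drop a single column.
no_marital <- toy %>%
  select(-marital)

# Drop several columns at once by wrapping them in c().
id_smoke <- toy %>%
  select(-c(marital, inc))

# The original data frame still has all of its columns.
names(toy)
```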
We use mutate to make new variables or change existing ones.
Create a new variable with a specific value
Create a new variable called data_decade. Imagine that you will be merging this dataset, from 1961-62, with a dataset from the 70s. To make that easier, create this variable with the value "60s".
data %>%
mutate(data_decade = "60s")
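Here is a rough sketch of why the constant column helps. It assumes that "merging" means stacking the rows of the two datasets with bind_rows; both toy tibbles and their values are made up, and data_60s and data_70s are hypothetical names.

```r
library(dplyr)

# Two made-up datasets standing in for the 1961-62 data and a later study.
data_60s <- tibble(id = 1:2, wt = c(100, 128)) %>%
  mutate(data_decade = "60s")
data_70s <- tibble(id = 3:4, wt = c(115, 140)) %>%
  mutate(data_decade = "70s")

# Stack the rows; data_decade records where each row came from.
combined <- bind_rows(data_60s, data_70s)

# Count rows per decade to confirm both sources are present.
combined %>%
  count(data_decade)
```

Without data_decade, the rows from the two sources could not be told apart after stacking.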
Create a new variable based on other variables
Create a new variable called wt_k. This variable gives mom’s pre-pregnancy weight (wt) in kilograms (1 pound ≈ 0.454 kilograms).
data %>%
mutate(wt_k = wt * 0.454)
# too many decimals? let's round things
data %>%
mutate(wt_k = round(wt * 0.454, 1))
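Note that the calls above only print their result; none of them saves the new column. A minimal sketch of the difference, using a made-up toy tibble rather than the real data:

```r
library(dplyr)

# Toy data; the weights in pounds are made up.
toy <- tibble(id = 1:3, wt = c(100, 128, 150))

# mutate() returns a new data frame; toy itself is unchanged here.
toy %>%
  mutate(wt_k = round(wt * 0.454, 1))
"wt_k" %in% names(toy)

# Assign the result back to keep the new column.
toy <- toy %>%
  mutate(wt_k = round(wt * 0.454, 1))
toy$wt_k
```

Assigning to a new name instead (for example, a hypothetical toy_kg) keeps the original data frame untouched, which can be safer while exploring.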